A New Visual Speech Recognition Approach for RGB-D Cameras

Authors

  • Ahmed Rekik
  • Achraf Ben-Hamadou
  • Walid Mahdi
Abstract

Visual speech recognition remains a challenging topic because speaking characteristics vary widely. This paper proposes a new lipreading approach that recognizes isolated speech segments (words, digits, phrases, etc.) using both 2D image and depth data. The proposed system works in three consecutive steps: mouth region tracking and extraction, computation of motion and appearance descriptors (HOG and MBH), and classification with a Support Vector Machine (SVM). The approach was evaluated on three public databases (MIRACL-VC, OuluVS, and CUAVE) in both speaker-dependent and speaker-independent settings. The recognition results show that lipreading can be performed effectively: the proposed approach outperforms recent works in the literature in the speaker-dependent setting and is competitive in the speaker-independent setting.
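To make the descriptor step more concrete, the following is a minimal sketch of a HOG-style appearance descriptor for a single mouth-region cell. It is not the authors' implementation: the function names `hog_cell_histogram` and `describe_frame`, the 9-bin unsigned orientation layout, the central-difference gradients, and the concatenation of image and depth histograms are all assumptions made for illustration. The abstract does not state how the two modalities are combined, and MBH and the SVM stage are not shown.

```python
import math


def hog_cell_histogram(patch, n_bins=9):
    """Unsigned gradient-orientation histogram for one cell, L2-normalised.

    `patch` is a 2D list of intensities (grayscale or depth values).
    Simplification: one cell only, no block normalisation, hard binning.
    """
    rows, cols = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    # Central differences on interior pixels only.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the orientation into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def describe_frame(gray_patch, depth_patch, n_bins=9):
    """Hypothetical fusion: concatenate the image and depth histograms."""
    return hog_cell_histogram(gray_patch, n_bins) + hog_cell_histogram(depth_patch, n_bins)


if __name__ == "__main__":
    # Synthetic patches: intensity grows left-to-right, depth grows top-to-bottom.
    gray = [[float(x) for x in range(6)] for _ in range(6)]
    depth = [[float(y) for _ in range(6)] for y in range(6)]
    print(describe_frame(gray, depth))
```

In a full pipeline, descriptors like this would be computed per frame over the tracked mouth region and passed to a classifier. Here a purely horizontal intensity ramp puts all gradient energy into the first orientation bin, which is a quick sanity check on the binning.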

Similar resources

Bilinear Heterogeneous Information Machine for RGB-D Action Recognition

This paper proposes a novel approach to action recognition from RGB-D cameras, in which depth features and RGB visual features are jointly used. Rich heterogeneous RGB and depth data are effectively compressed and projected to a learned shared space, in order to reduce noise and capture useful information for r...

Introduction to the special issue on visual understanding and applications with RGB-D cameras

The prevalence of affordable RGB-D cameras, such as Microsoft's Kinect and ASUS's Xtion Pro Live sensors, is driving a revolution in the landscape of computer vision and vision-related research. The pixel-level depth and visual (RGB) information provided by an RGB-D camera not only enables robust vision applications but also opens up new research problems and opportunities across a wide range of...

RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments

RGB-D cameras (such as the Microsoft Kinect) are novel sensing systems that capture RGB images along with per-pixel depth information. In this paper we investigate how such cameras can be used for building dense 3D maps of indoor environments. Such maps have applications in robot navigation, manipulation, semantic mapping, and telepresence. We present RGB-D Mapping, a full 3D mapping system tha...

Robust Intrinsic and Extrinsic Calibration of RGB-D Cameras

Color-depth cameras (RGB-D cameras) have become the primary sensors in most robotics systems, from service robotics to industrial robotics applications. Typical consumer-grade RGB-D cameras are provided with a coarse intrinsic and extrinsic calibration that generally does not meet the accuracy requirements needed by many robotics applications (e.g., high accuracy 3D environment reconstruction an...

Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot

We explore new aspects of assistive living on smart human-robot interaction (HRI) that involve automatic recognition and online validation of speech and gestures in a natural interface, providing social features for HRI. We introduce a whole framework and resources of a real-life scenario for elderly subjects supported by an assistive bathing robot, addressing health and hygiene care issues. We...

Publication date: 2014